Rfpred: a Random Forest Approach for Prediction of Missense Variants in Human Exome

Author

  • FABIENNE JABOT-HANIN
Abstract

Exome sequencing is becoming a standard tool for gene mapping of genetic diseases. Given the vast amount of data generated by next-generation sequencing techniques, identifying disease-causing variants is like finding a needle in a haystack. Assessing the impact of potentially pathogenic variants and prioritizing them is expected to reduce the work of biological validation, which is long and costly. One possible approach to determining the most probably deleterious variants in individual exomes is to predict alterations of protein function. In this paper we propose using a machine learning approach, the random forest, to build a new meta-score based on five previously described scores (SIFT, Polyphen2, LRT, PhyloP and MutationTaster) compiled in the dbNSFP database. The functional meta-score was trained on a dataset of 61,500 non-synonymous single nucleotide polymorphisms (SNPs). The random forest method (rfPred) appears to perform better overall than each of the classifiers taken separately or combined in a logistic regression model, and better than a newly described score (CADD), on independent validation sets. rfPred scores have been pre-calculated for all possible non-synonymous SNPs of the human exome and are freely accessible at the web server http://www.sbim.fr/rfPred/
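The idea of a random forest meta-score over five component scores can be illustrated with a small standard-library Python sketch. It is not the authors' implementation or the rfPred package: the data are synthetic, the five inputs are hypothetical scores rescaled to [0, 1] with higher meaning more deleterious, and every function name and parameter (`make_synthetic`, `train_forest`, `meta_score`, `max_depth`, `n_try`) is an assumption made for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, features):
    """Return the (feature, threshold) pair that most reduces Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, depth=0, max_depth=4, n_try=2):
    """Grow one tree; each node considers a random subset of n_try features."""
    split = None
    if depth < max_depth:
        split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return sum(labels) / len(labels)  # leaf: fraction labelled deleterious
    f, t = split
    lo = [i for i, r in enumerate(rows) if r[f] <= t]
    hi = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in lo], [labels[i] for i in lo],
                       rng, depth + 1, max_depth, n_try),
            build_tree([rows[i] for i in hi], [labels[i] for i in hi],
                       rng, depth + 1, max_depth, n_try))

def tree_predict(tree, row):
    """Walk from the root to a leaf and return the leaf's deleterious fraction."""
    while isinstance(tree, tuple):
        f, t, lo, hi = tree
        tree = lo if row[f] <= t else hi
    return tree

def train_forest(rows, labels, n_trees=30, seed=0):
    """Train each tree on a bootstrap resample of the training variants."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def meta_score(forest, row):
    """Meta-score in [0, 1]: average of the per-tree leaf fractions."""
    return sum(tree_predict(t, row) for t in forest) / len(forest)

def make_synthetic(n, seed=0):
    """Synthetic stand-in data: five hypothetical scores per variant, 0/1 label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 0.7 if y else 0.3
        rows.append([min(1.0, max(0.0, rng.gauss(centre, 0.15))) for _ in range(5)])
        labels.append(y)
    return rows, labels

# Usage on synthetic data only; the printed values are illustrative.
rows, labels = make_synthetic(150, seed=1)
forest = train_forest(rows, labels, n_trees=30, seed=2)
print(meta_score(forest, [0.9] * 5), meta_score(forest, [0.1] * 5))
```

Averaging leaf fractions across bootstrap-trained trees gives a continuous score rather than a hard class label, which is what makes the output usable for ranking variants. In practice the real component scores have different scales and directions, so they would need to be harmonised before training.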


Similar resources

Computational approach towards identification of pathogenic missense mutations in AMELX gene and their possible association with amelogenesis imperfecta

Amelogenin gene (AMEL-X) encodes an enamel protein called amelogenin, which plays a vital role in tooth development. Any mutations in this gene or the associated pathway lead to developmental abnormalities of the tooth. The present study aims to analyze functional missense mutations in AMEL-X genes and derive an association with amelogenesis imperfecta. The information on miss...


Calculating rfPred scores with package rfPred

Exome sequencing is becoming a standard tool for gene mapping of monogenic diseases. Given the vast amount of data generated by Next Generation Sequencing techniques, identification of disease causal variants is like finding a needle in a haystack. The impact assessment and the prioritization of potential pathogenic variants are expected to reduce work in biological validation, which is long an...


P-125: Identification of Novel Missense Mutations of The TGFBR3 Gene in Chinese Women with Premature Ovarian Failure

Background: The aim of this study was to assess the association between human transforming growth factor beta receptor, type III (TGFBR3) and idiopathic premature ovarian failure (POF) in a Chinese population. Materials and Methods: A total of 112 Chinese women with idiopathic POF and 110 normal controls were examined. DNA samples prepared from blood leukocytes were used as templates for polymerase-chai...


Analysis of Missense Mutations of CX3CR1 Gene in Patients with Recurrent Pregnancy Loss Using Bioinformatics Tools

Introduction: Abortion is a common complication that refers to the early termination of pregnancy with the death of the fetus before the 20th week of pregnancy. Previous studies show that many genes are involved in this disease, including the CX3CR1 gene, which is one of the inflammatory response genes in the immune system. The pathogenicity of these variants was determined in this study using ...


Whole Exome Sequencing Reveals a BSCL2 Mutation Causing Progressive Encephalopathy with Lipodystrophy (PELD) in an Iranian Pediatric Patient

Background: Progressive encephalopathy with or without lipodystrophy is a rare autosomal recessive childhood-onset seipin-associated neurodegenerative syndrome, leading to developmental regression of motor and cognitive skills. In this study, we introduce a patient with developmental regression and autism. The causative mutation was found by exome sequencing. Methods: The proband showed a gener...




Publication date: 2016